Experience with gravitational wave detector prototypes shows that the
interferometric output signal will be corrupted by a significant amount
of non-Gaussian noise. A large part of it consists of long-term
sinusoids with slowly varying envelopes (such as violin resonances in
the suspensions, or mains power harmonics) and short-term ringdown noise
(which may emanate from servo control systems, electronics in a
non-linear state, etc.). Since non-Gaussian noise components make the
detection and estimation of the gravitational wave signature more
difficult, we propose a denoising algorithm based on adaptive filtering
techniques (LMS methods) to separate and extract them from the
stationary and Gaussian background noise. The strength of the method is
that it does not require any precise model of the observed data: the
signals are distinguished on the basis of their autocorrelation time. We
believe that the robustness and simplicity of this method make it useful
for data preparation and for the understanding of the first
interferometric data. We present the detailed structure of the algorithm
and its application to both simulated data and real data from the LIGO
40 meter prototype.
I. INTRODUCTION

Over the next decade, several large-scale interferometric gravitational
wave detectors will come on-line. These include LIGO, composed of two
Laser Interferometer Gravitational-wave Observatories situated in the
U.S. [1], VIRGO, a French/Italian project located near Pisa [2], GEO600,
a German/British interferometer under construction near Hannover [3],
TAMA in Japan, a medium-scale laser interferometer [4], and, with
funding approval, AIGO500, the proposed 500 meter project sponsored by
ACIGA. There are also separate proposals for space-based detectors which
could be operational twenty-five years from now (e.g., LISA: the Laser
Interferometer Space Antenna, a cornerstone project of the European
Space Agency [5]). In the meantime, a number of existing resonant bar
detectors will have had their sensitivities further enhanced.

The key to gravitational wave detection is the very precise measurement
of small changes in distance. For laser interferometers, this is the
distance between pairs of mirrors hanging at either end of two long,
mutually perpendicular vacuum chambers. Gravitational waves passing
through the instrument will shorten one arm while lengthening the other.
By using an interferometer design, the relative change in length of the
two arms can be measured, thus signaling the passage of a gravitational
wave at the detector site. Long arm lengths, high laser power, and
extremely well-controlled laser stability are essential to reach the
requisite sensitivity, since the gravitational waves will be faint and
will only weakly modify the structure of space-time in the detector's
arms (see e.g., [6]).

Gravitational wave detectors produce an enormous vol- 
ume of output (e.g., of the order of 16 MB/sec for the 
LIGO instruments) consisting mainly of noise from a host 
of sources both environmental and intrinsic to the appa- 
ratus. Buried in this noise will be the gravitational wave 
signature. Sophisticated data analysis techniques will 
have to be developed to optimally extract astrophysical 
data. Many of the techniques developed so far [j7]-[| are 
based on matched filtering and assume stationary Gaus- 
sian noise. 

However, the real data stream from the detectors is not 
expected to satisfy the stationary and Gaussian assump- 
tions. In fact, the data from the Caltech 40 meter proto- 
type interferometer has the expected broadband noise 
spectrum, but superposed on this are several other noise 
features 0; such as long-term sinusoidal disturbances 
emanating from suspensions and electric mains harmonics, and also
occasional transients, typically due to servo-control instabilities or
mechanical relaxation in the suspension system. Since no precise a
priori model can be given for this noise until the detector is completed
and fully tested, matched filtering techniques cannot be used to locate
or remove these noisy signals.

This disparity between standard Gaussian assumptions 
and real data characteristics poses a major problem to 
the direct application of matched filtering techniques. 
This is true when searching for burst sources such as 
blackhole binary quasinormal ringings pp| . This is also 
the case for the inspiral searches in Caltech 40meter data, 
where one has to introduce a veto Q on the decision 
taken with the matched filter to ensure that the detected 
signal is actually the one we are looking for. 

It is possible that, in the future, improved experimental techniques and
greater experience will reduce or even completely eliminate some of
these nonstationary and non-Gaussian features. Nevertheless, it will
probably take some time to reach such high data quality. Therefore, it
is necessary and desirable to combat this noise. Since such noise
features defy modeling, a novel approach to the problem is called for.

We propose a denoising method based on LMS adap- 
tive linear prediction techniques which does not require 
any precise a priori information about the noise char- 
acteristics. Although our method does not pretend to 
optimality, we believe that its simplicity makes it use- 
ful for data preparation and for the understanding of the 
first data. 

In the following, we present the principles of LMS adaptive denoising
(Sect. II), a characterization of its behavior on a simple model of the
noise from the interferometer (Sect. III), the precise structure of the
denoising algorithm (Sect. IV) and results (Sect. V) obtained with
simulated data and also with real data taken from the Caltech 40 meter
prototype interferometer.

This work here is preliminary; its goal is to explore 
how effectively adaptive filtering techniques perform on 
the problem we address. It is a first step towards a more 
complete statistical evaluation of the algorithm. 



II. METHODS 
A. From hypothesis to method 

We assume that the noise consists of broadband Gaus- 
sian noise plus large amplitude oscillating interference 
signals. The model does not include any a priori knowl- 
edge of the signal such as its exact frequency or shape 
of the envelope. The only assumption we make is that 
its autocorrelation over a small time-lag - the time-lag 
chosen greater than the decorrelation time scale of the 
broadband noise - is appreciable, while for the broad- 
band noise it is essentially zero. This difference can be 
used to advantage to discriminate between the narrow 
band interferences and the broadband noise. 

The idea is to predict the current signal sample given 
the previous samples of the data. This is possible, only if 
the target sample shares enough information with (i.e., is 
sufficiently correlated to) the previous samples. In other 
words, the only predictable part of the signal is the one 
whose correlation length is sufficiently large (i.e., long- 
term sinusoids or ringdowns). Conversely, broadband 
noise cannot be predicted, as it is not possible to guess 
the next value in this way. It is this crucial underlying 
idea we use to discriminate between the two noise signals. 
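The discrimination criterion can be illustrated with a minimal Python sketch. The helper name `sample_autocorr`, the frequency of 0.02 cycles per sample, the unit noise variance, and the lag of 50 samples are all assumptions chosen for the illustration; they are not values taken from the detector data.

```python
import math
import random


def sample_autocorr(x, lag):
    """Biased sample autocorrelation of x at `lag`, normalized by the lag-0 value."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x) / n
    c = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n
    return c / c0


random.seed(0)
n = 20000
# Broadband part: white Gaussian noise (assumed unit variance).
noise = [random.gauss(0.0, 1.0) for _ in range(n)]
# Narrowband part: a sinusoid at an assumed 0.02 cycles per sample.
sinusoid = [math.cos(2 * math.pi * 0.02 * k) for k in range(n)]

lag = 50  # one full period of the sinusoid, well beyond the white-noise decorrelation time
rho_noise = sample_autocorr(noise, lag)
rho_sine = sample_autocorr(sinusoid, lag)
print("white noise:", rho_noise)
print("sinusoid:   ", rho_sine)
```

At this lag the sinusoid remains almost fully correlated with itself, whereas the white noise autocorrelation is compatible with zero; this is the only property the predictor relies on.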



B. Mean square linear prediction 

Let us recall some standard principles for designing an optimal linear
predictor. The question to address is how to optimally predict the data
sample $x_k$ with a collection of past samples
$\mathbf{x}_k := (x_{k-d-m},\ m = 0, 1, \ldots, N-1)^t$, given the delay
$d > 1$ (the quantity $d$ is also referred to as the prediction depth).
The prediction is obtained by linearly combining these data samples
weighted by the $N$ corresponding coefficients¹ $w^{(m)}$, forming the
tap-weight vector $\mathbf{w} := (w^{(m)},\ m = 0, 1, \ldots, N-1)^t$,
where the superscript 't' denotes the transpose of the vector.
Therefore, the prediction $y_k$ of $x_k$ reads

$$ y_k = \mathbf{w}^t \mathbf{x}_k . \tag{1} $$

The predictor is optimal in the mean square sense when the variance of
the prediction error $e_k = x_k - y_k$ is minimum. Therefore, the
problem is to find the set of weight coefficients which minimizes

$$ J_k(\mathbf{w}) := E[e_k^2] = E[(x_k - \mathbf{w}^t \mathbf{x}_k)^2], \tag{2} $$

where $E$ denotes the expectation value operator.

This leads to the minimization of the following quadratic form

$$ J_k(\mathbf{w}) = \sigma_k^2 - 2\mathbf{w}^t \mathbf{p}_k + \mathbf{w}^t R_k \mathbf{w}, \tag{3} $$

where $\sigma_k^2 := E[x_k^2]$, $\mathbf{p}_k := E[x_k \mathbf{x}_k]$
and $R_k := E[\mathbf{x}_k \mathbf{x}_k^t]$. There exists only one
solution $\mathbf{w}^*_k$, obtained when the gradient of $J_k$ vanishes.
This situation is realized when

$$ R_k \mathbf{w}^*_k = \mathbf{p}_k . \tag{4} $$

When the signal is stationary, $R_k = R$ and $\mathbf{p}_k = \mathbf{p}$
are constant (independent of $k$). In this case, $R$ defines the
autocorrelation matrix of the signal $x_k$ and the solution of (4) is
referred to as the Wiener filter.
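For reference, the step from the mean square error (2) to the quadratic form (3) and the normal equations (4) can be written out as follows. This is a standard worked derivation added for convenience, with $\sigma_k^2 = E[x_k^2]$, $\mathbf{p}_k = E[x_k\mathbf{x}_k]$ and $R_k = E[\mathbf{x}_k\mathbf{x}_k^t]$ as defined in the text; the last line, giving the minimum value of the cost, uses only Eqs. (3) and (4).

```latex
\begin{align*}
J_k(\mathbf{w}) &= E\big[(x_k - \mathbf{w}^t\mathbf{x}_k)^2\big]
  = E[x_k^2] - 2\,\mathbf{w}^t E[x_k\mathbf{x}_k]
    + \mathbf{w}^t E[\mathbf{x}_k\mathbf{x}_k^t]\,\mathbf{w}, \\
\nabla_w J_k(\mathbf{w}) &= -2\,\mathbf{p}_k + 2\,R_k\mathbf{w}
  \qquad (\text{using } R_k = R_k^t), \\
\nabla_w J_k(\mathbf{w}^*_k) = 0 \;&\Longrightarrow\;
  R_k\mathbf{w}^*_k = \mathbf{p}_k, \\
J_k(\mathbf{w}^*_k) &= \sigma_k^2 - 2\,\mathbf{w}^{*t}_k\mathbf{p}_k
    + \mathbf{w}^{*t}_k R_k\,\mathbf{w}^*_k
  = \sigma_k^2 - \mathbf{p}_k^t\,\mathbf{w}^*_k .
\end{align*}
```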



C. Linear prediction and LMS method 

Eq. (4) requires the computationally expensive inversion of the matrix
$R_k$. An alternative and more efficient way of finding the minimum of
(3) consists in starting from an arbitrary initial value
$\mathbf{w}_{k,0}$ and iterating the tap-weight vector along the
steepest descent direction,

$$ \mathbf{w}_{k,n+1} = \mathbf{w}_{k,n} - \mu \nabla_w J_k(\mathbf{w}_{k,n}), \tag{5} $$

given by the gradient

$$ \nabla_w J_k(\mathbf{w}) = 2(R_k \mathbf{w} - \mathbf{p}_k). \tag{6} $$

For a sufficiently small gain $\mu$, the weight vectors will eventually
converge to the optimal predictor filter $\mathbf{w}^*_k$. This
procedure requires the second order statistics (namely $R_k$ and
$\mathbf{p}_k$) of the signal. In our case, this information is not
available and one therefore has to estimate these quantities. Instead of
estimating $R_k$ and $\mathbf{p}_k$ directly and combining them with
Eqs. (5) and (6), a more efficient solution is to estimate the gradient.
From the derivation of Eq. (4), one can rewrite the gradient as

$$ \nabla_w J_k(\mathbf{w}) = -2E[e_k \mathbf{x}_k]. \tag{7} $$

A simple and natural way to obtain an estimator of this quantity is to
omit the expectation operator:

$$ \hat{\nabla}_w J_k = -2 e_k \mathbf{x}_k . \tag{8} $$

Because the noise perturbs this estimate, the algorithm may iterate in a
direction which does not lie along the direction of steepest descent,
thus preventing the filter from converging to the Wiener filter. To
counter this, we stabilize the estimate above by setting the algorithm
time index $n$ equal to the signal time index $k$ in Eq. (5). The
evolution equation for the tap-weight vector finally reads:

$$ \mathbf{w}_{k+1} = \mathbf{w}_k + 2\mu e_k \mathbf{x}_k . \tag{9} $$

¹ We put brackets around indices of vectors and matrices in order to
distinguish them from the time index.

At a fixed time $k$, the weight vector evolves along the crude estimate
of the steepest descent direction. Over a longer duration, however, the
direction followed by the tap-weight vector is governed by the sum of
the successive gradient estimates obtained with different noise samples.
In other words, we have replaced the ensemble average in (7) by a time
average. This implies that we have implicitly made further assumptions
on the signal $x_k$: first, its local stationarity (more precisely, the
second order statistics are supposed to be constant during the
convergence time of the algorithm) and second, its ergodicity.

Summarizing, the method we propose consists in linearly filtering the
data to extract the part of the signal with a long correlation time. As
illustrated by the block diagram in Fig. 1, the finite impulse response
filter (given by $\mathbf{w}_k$) is modified at each iteration according
to relation (9), with the final goal of minimizing the mean square
error. Once the filter has converged (i.e., $\mathbf{w}_k$ is stable in
time), we reject the predicted part of the signal (corresponding to the
long-term sinusoidal or the ringdown signals) and we send the rest of
the signal on for further analysis and detection.
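The scheme summarized above can be sketched in a few lines of Python. This is a minimal illustration of the update $\mathbf{w}_{k+1} = \mathbf{w}_k + 2\mu e_k \mathbf{x}_k$ applied to a simulated unit-amplitude sinusoid in white Gaussian noise. The function name `ale`, the test signal, and the parameter values (40 taps, prediction depth 2, step gain 0.002, noise standard deviation 0.3) are assumptions made for the example, not the settings used in this work.

```python
import math
import random


def ale(x, n_taps, depth, mu):
    """LMS linear prediction of x[k] from delayed samples (adaptive line enhancer).

    Returns (y, e): the predicted narrowband part and the prediction error.
    """
    w = [0.0] * n_taps          # start from the null filter
    y = [0.0] * len(x)
    e = list(x)                 # nothing is predicted until enough history exists
    for k in range(depth + n_taps - 1, len(x)):
        # Reference vector (x[k-d], x[k-d-1], ..., x[k-d-N+1]).
        ref = [x[k - depth - m] for m in range(n_taps)]
        y[k] = sum(wm * rm for wm, rm in zip(w, ref))
        e[k] = x[k] - y[k]
        # LMS update: w <- w + 2 * mu * e_k * x_k
        w = [wm + 2.0 * mu * e[k] * rm for wm, rm in zip(w, ref)]
    return y, e


random.seed(1)
n = 6000
sigma = 0.3                                     # assumed noise standard deviation
clean = [math.cos(2 * math.pi * 0.05 * k) for k in range(n)]
x = [s + random.gauss(0.0, sigma) for s in clean]

y, e = ale(x, n_taps=40, depth=2, mu=0.002)

# Mean square deviation between the prediction and the noise-free sinusoid,
# measured over the last 2000 samples (after the filter has locked).
tail = range(n - 2000, n)
mse_pred = sum((y[k] - clean[k]) ** 2 for k in tail) / len(tail)
print("mean square deviation of prediction from sinusoid:", mse_pred)
```

In this sketch the output `y` plays the role of the predicted (narrowband) part that is rejected, and `e` is the residual that would be passed on for further analysis.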

D. Properties of the LMS method 

The method we described above is referred to as the adaptive line
enhancer (ALE). It is a special case of the LMS algorithm. Both the ALE
and LMS algorithms were first introduced by Widrow and Hoff [12] in the
1960's.

The acronym LMS (Least Mean Square) designates a general scheme for
designing signal processing methods where the minimization (in a
statistical sense) of a positive definite quadratic cost function
(usually related to some mean quadratic error) is needed. Its central
idea is the use of the estimate of the gradient of this function given
in Eq. (8). The LMS technique has been extensively used for the last 30
years in communications problems such as echo cancellation, channel
equalization, antenna processing, etc. The main advantages to be gained
by applying the LMS technique are (i) adaptivity, (ii) robustness,
(iii) simplicity.

In this context, the term "adaptivity" has two different 
meanings. First, it means that the LMS technique will 
automatically modify its parameters to reach for the best 
setup for a problem which has not been initially precisely 
defined. Second, it is also able to follow changes in the 
characteristics of the data being processed in the event 
that they occur. The latter property also shows that the 
method is robust. In fact, this method has been proved
to be robust according to specific statistical criteria such
as the minimax criterion [13].

The ALE is an adaptive prediction algorithm using the 
LMS technique. We have seen that the signal is predicted 
from a reference signal which is the signal itself. In some 
other applications, although the same principles are ap- 
plied, the reference signal can be another signal, e.g. echo 
cancellation or denoising. In such cases, the quantity of 
interest might not be the prediction output but the linear 
filter used to compute it, e.g., deconvolution. 

III. ADAPTING ALE FILTER TO CANCELING 
NOISE IN GW DATA 

In this section we essentially describe a model for un- 
derstanding the behaviour of the ALE algorithm. The 
model we assume consists of a high amplitude narrow- 
band signal superposed on broadband noise. For sim- 
plicity, we assume the broadband noise to be white and 
Gaussian and the narrowband signals are sinusoids of 
constant envelope. The results we obtain hold for more 
realistic signals when the evolution of their amplitude 
and/or instantaneous frequency occurs adiabatically, i.e., 
the change is small over the period of the sinusoid. 

The assumption of white noise is not too restrictive 
because this is equivalent to choosing the noise correla- 
tion time to be zero and therefore we are free to choose 
the prediction depth (i.e., the time delay between the cur- 
rent predicted data sample and the reference signal to the 
LMS filter) to be arbitrarily small. In a real situation, 
we must fix the delay to be greater than the correlation 
time of the broadband noise. We first analyze the case 
of the sinusoid because it is easier to investigate and pro- 
vides invaluable insights into the workings of the LMS 
algorithm. 

It may be remarked that the denoising of sinusoids 
in white noise has been treated in the literature with 
great detail (see for a review). We give here only 

pertinent results (with a short proof) for introducing the 
structure of the algorithm, which we present later in the 
text. 
A. Optimal filter

We consider the data to be of the following form,

$$ x_k := \cos(2\pi f_0 t_k + \Phi) + n_k , \tag{10} $$

where $t_k := k\delta$, $\delta := 1/f_s$ being the sampling interval,
and $\Phi$ is a random phase (at the origin) with uniform probability
density function between $-\pi$ and $\pi$. The sinusoid has frequency
$f_0$ and the units are so chosen that it is of unit amplitude. The
additive white noise $n_k$ with variance $\sigma^2$ satisfies the
relation

$$ E[n_k n_m] := \sigma^2 \delta_{km}, \tag{11} $$

where $\delta_{km}$ is the Kronecker delta.

The reference signal to the adaptive filter is just the data delayed by
the amount $d\delta$, where $d$ is the number of time samples. We choose
$N$ weights for the length of our filter ($\mathbf{w}_k$ can be thought
of as a column vector); the "reference vector" $\mathbf{x}_k$ at the
$k$th time instant $t_k$ then has the components $x_{k-d-n}$,
$n = 0, 1, \ldots, N-1$. The components of the autocorrelation matrix
$R$ and the vector $\mathbf{p}$ in Eq. (4) are given by

$$ R^{(mn)} = \tfrac{1}{2}\cos\big((m-n)\,\delta\phi\big) + \sigma^2 \delta_{mn}, \tag{12} $$

$$ p^{(m)} = \tfrac{1}{2}\cos\big((m+d)\,\delta\phi\big), \tag{13} $$

where $m, n = 0, 1, \ldots, N-1$ and $\delta\phi := 2\pi f_0 \delta$.
Note that we have dropped the index $k$ because the autocorrelation $R$
does not depend upon $k$, since we are dealing with a stationary signal.

From the above expressions of $R$ and $\mathbf{p}$ and solving Eq. (4),
we obtain the optimum Wiener filter

$$ w^{*(m)} = \frac{2}{N + 4\sigma^2}\cos\big((m+d)\,\delta\phi\big), \qquad m = 0, 1, \ldots, N-1, \tag{14} $$

where we have chosen the length of the filter to be a half-integral
number of cycles for reasons of simplicity, i.e. $N\delta\phi = l\pi$,
where $l$ is an integer.

In other words, the optimum linear predictor is nothing but a copy of
the expected signal itself. The filter in Eq. (14) is also referred to
as the matched filter. In our situation, in practice, $N \gg \sigma^2$
and the term $4\sigma^2$ can be omitted from the amplitude of
$\mathbf{w}^*$.

For the reasons detailed before, we propose to use the ALE algorithm in
order to find a good approximation of $\mathbf{w}^*$. Starting from an
arbitrary initial tap-weight vector, we iterate the weights
$\mathbf{w}_k$ according to Eq. (9) to converge to $\mathbf{w}^*$. Once
the filter is "close" enough to the optimal solution (the word "close"
will be defined later in the text), we say that the filter has locked
on to the signal.
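As a numerical cross-check of the closed-form Wiener filter, the following Python sketch builds $R^{(mn)} = \tfrac12\cos((m-n)\delta\phi) + \sigma^2\delta_{mn}$ and $p^{(m)} = \tfrac12\cos((m+d)\delta\phi)$ row by row and verifies that $w^{*(m)} = 2\cos((m+d)\delta\phi)/(N+4\sigma^2)$ satisfies $R\mathbf{w}^* = \mathbf{p}$ when $N\delta\phi$ is a multiple of $\pi$. The function name `wiener_residual` and the numerical values are assumptions chosen for the example.

```python
import math


def wiener_residual(n_taps, depth, dphi, sigma2):
    """Largest component of |R w* - p| for the sinusoid-plus-white-noise model."""
    amp = 2.0 / (n_taps + 4.0 * sigma2)
    w = [amp * math.cos((m + depth) * dphi) for m in range(n_taps)]
    worst = 0.0
    for m in range(n_taps):
        # Row m of R applied to w: signal part plus sigma^2 on the diagonal.
        rw = sigma2 * w[m] + 0.5 * sum(
            math.cos((m - n) * dphi) * w[n] for n in range(n_taps)
        )
        p_m = 0.5 * math.cos((m + depth) * dphi)
        worst = max(worst, abs(rw - p_m))
    return worst


# Assumed values: N * dphi = 4 * pi, i.e. a half-integral number of cycles.
res = wiener_residual(n_taps=40, depth=2, dphi=0.1 * math.pi, sigma2=0.09)
print("max |R w* - p| =", res)
```

The residual is at the level of floating-point rounding, which confirms that the amplitude factor $2/(N+4\sigma^2)$ is the one consistent with the normal equations in this half-integral case.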



B. Approach to locking 

a. Continuous time approximation of the locking trajectory - We may
analyze the approach to locking by deriving a difference equation for
the averaged evolution of the weights and then investigating this
equation. It is impossible to obtain the average evolution of the
weights by using the standard definition of the expectation operator
$E$ because of the nonlinearity and the recursive scheme involved in
evolving the weights. We therefore adopt the time-average over
successive data points as the operational definition of $E$.

Shifting the origin to $\mathbf{w}^*$ by defining
$\mathbf{v}_k := \mathbf{w}_k - \mathbf{w}^*$, we may write the LMS
evolution equation in the following form [15]:

$$ \mathbf{v}_{k+1} - \mathbf{v}_k = -2\mu(\mathbf{x}_k \mathbf{x}_k^t)\mathbf{v}_k + 2\mu e^*_k \mathbf{x}_k , \tag{15} $$

where $e^*_k := x_k - \mathbf{w}^{*t}\mathbf{x}_k$ is the prediction
error produced when using the optimal filter.

During the locking phase, the filter is far from the optimal location
(i.e., $\mathbf{v}_k$ has a large modulus). The homogeneous term
dominates the forcing term in the difference equation (15), which can
then be approximated by:

$$ \mathbf{v}_{k+1} - \mathbf{v}_k = -2\mu(\mathbf{x}_k \mathbf{x}_k^t)\mathbf{v}_k . \tag{16} $$

In the situation where the step gain parameter $\mu$ is chosen to be
very small, so that the weight coefficients are almost constant over a
given time interval, the recursion eventually acts as an averaging
operation on both sides of the equation above. This leads to the
difference equation which we use to describe the tap-weight trajectory
in the space of weight coefficients, which we denote by $\mathcal{W}$:

$$ \mathbf{v}_{k+1} - \mathbf{v}_k = -2\mu R \mathbf{v}_k . \tag{17} $$

Let $Q$ be the transformation which diagonalizes $R$. The above
difference equation is best analyzed by changing the frame in
$\mathcal{W}$ to the principal axes,

$$ \tilde{\mathbf{v}}_{k+1} - \tilde{\mathbf{v}}_k = -2\mu \tilde{R} \tilde{\mathbf{v}}_k , \tag{18} $$

where $\tilde{R} := Q R Q^{-1} = \mathrm{diag}(\lambda^{(0)}, \ldots, \lambda^{(N-1)})$
and $\tilde{\mathbf{v}}_k := Q\mathbf{v}_k$. Eq. (18) gives decoupled
difference equations for the components $\tilde{v}_k^{(m)}$,
$m = 0, \ldots, N-1$ of $\tilde{\mathbf{v}}_k$, which can be solved
given the initial weight vector $\tilde{v}_0^{(m)}$:

$$ \tilde{v}_k^{(m)} = \tilde{v}_0^{(m)} \big(1 - 2\mu\lambda^{(m)}\big)^k . \tag{19} $$

b. Eigenvalues and eigenvectors of R - We need to compute the
eigenvalues $\lambda^{(m)}$ of $R$. This can be conveniently done by
splitting $R$ into the noise part, which is just $\sigma^2$ times the
identity, plus the signal part, which we denote by $S/2$; thus,

$$ R = \sigma^2 I + S/2, \tag{20} $$

where $S^{(mn)} := \cos\big((m-n)\,\delta\phi\big)$. It is easily
verified that the eigenvectors of $R$ and $S$ are identical and the
eigenvalues of $R$ are obtained from those of $S$ by first halving them
and then adding $\sigma^2$ to the result. It remains, therefore, to
compute the eigenvalues and eigenvectors of $S$. We do this by observing
that we can write $S$ as follows²:

$$ S = (\boldsymbol{\nu}\boldsymbol{\nu}^\dagger + \bar{\boldsymbol{\nu}}\bar{\boldsymbol{\nu}}^\dagger)/2, \tag{21} $$

where $\boldsymbol{\nu} := \big(1, \exp(i\delta\phi), \exp(2i\delta\phi), \ldots, \exp((N-1)i\delta\phi)\big)^t$.

Since the matrix $S$ is real and is essentially made out of two outer
products of $\boldsymbol{\nu}$ and $\bar{\boldsymbol{\nu}}$, its rank
equals 2 ($S$ has $N-2$ degenerate eigendirections in $\mathcal{W}$
with eigenvalue zero) with two non-zero real eigenvalues. Let
$\mathbf{v}$ be an eigenvector associated with one of the non-trivial
eigenvalues. According to the structure of $S$, the vector $\mathbf{v}$
can be written without loss of generality as the following linear
combination,

$$ \mathbf{v} = \boldsymbol{\nu}\exp(-i\alpha) + \bar{\boldsymbol{\nu}}\exp(i\alpha), \tag{22} $$

where the coefficients have been chosen to have unit modulus
arbitrarily.

Using the two scalar products $\boldsymbol{\nu}^\dagger\boldsymbol{\nu} = N$ and

$$ \bar{\boldsymbol{\nu}}^\dagger\boldsymbol{\nu} = 1 + \exp(2i\delta\phi) + \ldots + \exp(2(N-1)i\delta\phi) \tag{23} $$

$$ := \beta \exp(i\gamma), \tag{24} $$

where the geometric series can be summed and its modulus and phase
ascertained:

$$ \beta = \frac{\sin(N\delta\phi)}{\sin\delta\phi}, \tag{25} $$

$$ \gamma = (N-1)\,\delta\phi, \tag{26} $$

we obtain the effect of the matrix $S$ on the vector $\mathbf{v}$,
given by

$$ S\mathbf{v} = \boldsymbol{\nu}\exp(-i\alpha)\big(N + \beta\exp(-i\gamma + 2i\alpha)\big)/2 + \mathrm{c.c.}, \tag{27} $$

where c.c. denotes complex conjugate.

This expression has to be compared to the right-hand side of the
eigenvalue equation $S\mathbf{v} = \lambda\mathbf{v}$, leading to two
solutions for $\alpha$, namely, $\alpha = \gamma/2$ and
$\alpha = (\gamma - \pi)/2$. These yield the eigenvectors
$\mathbf{v}_\pm$ and the corresponding eigenvalues $\lambda_\pm$:

$$ \mathbf{v}_+ = \boldsymbol{\nu}\exp(-i\gamma/2) + \bar{\boldsymbol{\nu}}\exp(i\gamma/2), \qquad
   \mathbf{v}_- = i\boldsymbol{\nu}\exp(-i\gamma/2) - i\bar{\boldsymbol{\nu}}\exp(i\gamma/2), \tag{28} $$

$$ \lambda_\pm = (N \pm \beta)/2. \tag{29} $$



² By definition, the vectors $\bar{\mathbf{x}}$ and
$\mathbf{x}^\dagger := \bar{\mathbf{x}}^t$ denote respectively the
complex conjugate and the hermitian transpose of $\mathbf{x}$.



If we choose $N$ large enough and $N\delta\phi = m\pi$, where $m$ is an
integer, then the analysis becomes simpler. This amounts to choosing the
length of the filter to span a half-integral number of cycles: we have
$\beta = 0$ and $\lambda_\pm = N/2$. (Geometrically, this means that the
eigenvalue problem is degenerate with respect to the two signal
eigenvectors: there is a two dimensional eigenspace belonging to the
eigenvalue $N/2$. The weights thus evolve non-preferentially with
respect to the signal eigendirections.) Since typical cases generally
imply $N \gg \beta$, we will assume this simplification in the rest of
the paper.

In this situation, the spectrum of $R$,

$$ \mathrm{sp}(R) = \{\lambda^{(0)} = \lambda^{(1)} = N/4 + \sigma^2 \ \text{and}\ \lambda^{(m)} = \sigma^2,\ m = 2, \ldots, N-1\}, \tag{30} $$

consists of two sets of eigenvalues: the first two correspond to
directions in the signal space associated with "signal+noise" (or
"signal", for short) whereas the remaining $N-2$ characterize "noise"
directions.

According to Eq. (19), the weight vector will converge more rapidly in
directions associated with the largest eigenvalues, which are the signal
eigenvalues. The other noise eigenvalues are unimportant in this
consideration. The eigenvectors pertaining to the signal provide
preferred directions in $\mathcal{W}$: it is along these directions that
the slope of the performance surface is steep and hence promotes faster
convergence.
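The signal eigenpairs can be checked numerically without any linear algebra library. The Python sketch below uses the real forms of the two signal eigenvectors, proportional to $\cos(m\delta\phi - \gamma/2)$ and $\sin(m\delta\phi - \gamma/2)$, with $\beta = \sin(N\delta\phi)/\sin\delta\phi$ and $\gamma = (N-1)\delta\phi$, and verifies that they are eigenvectors of $R = \sigma^2 I + S/2$ with eigenvalues $\sigma^2 + (N \pm \beta)/4$. The function name `signal_eigen_check` and the values $N = 10$, $\delta\phi = 0.3$, $\sigma^2 = 0.09$ are assumptions chosen so that $\beta \neq 0$, i.e. the general (non half-integral) case.

```python
import math


def signal_eigen_check(n_taps, dphi, sigma2):
    """Max residual |R v - lambda v| for each of the two signal eigenvectors of R."""
    beta = math.sin(n_taps * dphi) / math.sin(dphi)
    gamma = (n_taps - 1) * dphi
    # Real forms of v_+ and v_- (overall factors dropped).
    v_plus = [math.cos(m * dphi - gamma / 2) for m in range(n_taps)]
    v_minus = [math.sin(m * dphi - gamma / 2) for m in range(n_taps)]
    pairs = ((v_plus, (n_taps + beta) / 2), (v_minus, (n_taps - beta) / 2))
    residuals = []
    for v, lam_s in pairs:
        lam_r = sigma2 + lam_s / 2      # eigenvalue of R = sigma^2 I + S/2
        worst = 0.0
        for m in range(n_taps):
            rv = sigma2 * v[m] + 0.5 * sum(
                math.cos((m - n) * dphi) * v[n] for n in range(n_taps)
            )
            worst = max(worst, abs(rv - lam_r * v[m]))
        residuals.append(worst)
    return residuals


res_plus, res_minus = signal_eigen_check(n_taps=10, dphi=0.3, sigma2=0.09)
print("residual for v_+:", res_plus)
print("residual for v_-:", res_minus)
```

Both residuals vanish to rounding precision; setting $N\delta\phi$ to a multiple of $\pi$ makes $\beta = 0$ and the two eigenvalues coincide at $N/4 + \sigma^2$.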



C. Steady state evaluation 

If the step gain factor is sufficiently small, the tap-weight
coefficients eventually converge and stabilize in a neighbourhood of the
optimal value. At this stage, the assumptions made in obtaining the
approximate evolution equation (17) no longer hold. In contrast with the
case of "the approach to locking", the right hand side of the difference
equation (15) is now dominated by the forcing term:

$$ \mathbf{v}_{k+1} - \mathbf{v}_k = 2\mu e^*_k \mathbf{x}_k . \tag{31} $$

Roughly speaking, the trajectory of the vector $\mathbf{w}_k$ during the
steady state can be viewed as a random walk centered around
$\mathbf{w}^*$, lying within a region of $\mathcal{W}$ space whose
extent is determined by two factors, namely, $\mu$ and the intrinsic
geometry of $\mathcal{W}$ in the vicinity of $\mathbf{w}^*$.

The misalignment between the actual ALE filter $\mathbf{w}_k$ and the
optimal one $\mathbf{w}^*$ creates an additional error in the output. In
fact, a direct calculation from Eq. (3) shows that the total mean square
error may be decomposed as [11]:

$$ J_k(\mathbf{w}_k) = \xi_{\min} + \xi_k , \tag{32} $$

where (i) $\xi_{\min} := J_k(\mathbf{w}^*)$ is the minimum mean square
error arising from the fraction of the input noise which still remains
in the output, assuming that the ALE filter has reached exact
optimality, and (ii) $\xi_k := \mathbf{v}_k^t R \mathbf{v}_k$ is the
excess mean square error (EMSE) due to the misalignment between the ALE
filter and the Wiener filter.

One can verify that the EMSE vanishes when reaching optimality, i.e.,
when $\mathbf{v}_k = 0$. In other words, this term quantifies the
non-optimality of the current filter in use. We can think of $\xi_k$ as
the square of a natural distance in $\mathcal{W}$ and of $R$ as an
intrinsic metric over $\mathcal{W}$.

A good approximation of $\xi_{\min}$ can be found for a large number of
weight coefficients for the specific case of sinusoidal signals with
high SNRs. Using Eqs. (3) and (4), we may write
$\xi_{\min} = E[x_k^2] - \mathbf{p}^t\mathbf{w}^*$. When
$N \to \infty$, a direct calculation shows that the second term
$\mathbf{p}^t\mathbf{w}^*$ tends to the energy of the sinusoid, which
means that the remaining energy is that of the noise:
$\xi_{\min} \approx \sigma^2$.

We complete the characterization of the mean square error (32) with the
evaluation of the average value of the EMSE, which we denote by
$\xi^{(st)}$. Firstly, noticing that the EMSE is invariant under the
principal axis transformation,

$$ \xi_k = \sum_{m=0}^{N-1} \lambda^{(m)} \big(\tilde{v}_k^{(m)}\big)^2 , \tag{33} $$

and secondly, using the approximation
$E[(\tilde{v}_k^{(m)})^2] \approx \mu\,\xi_{\min}$ proposed in [12] to
obtain the typical value for $(\tilde{v}_k^{(m)})^2$, yields

$$ \xi^{(st)} = \mu \sum_{m=0}^{N-1} \lambda^{(m)}\, \xi_{\min} . \tag{34} $$

Since the signal is of much larger amplitude than the broadband noise,
the trace of $R$ is essentially due to the signal eigenvalues (see
Eq. (30)). Combining with the expression of $\xi_{\min}$ above, this
leads to:

$$ \xi^{(st)} \approx \mu N \sigma^2 / 2. \tag{35} $$



A better estimate of $\xi^{(st)}$ can be obtained starting with
more realistic hypotheses and using more sophisticated
approximations



- /iA( m ) 



(36) 



In the limit of small step size, this approximation tends
to the simpler one in Eq. (35).



D. Convergence time 

In the expression of the EMSE in Eq. (33), we separate the sum into two
parts: the first, $\xi^{(n)}$, associated with the noise (i.e.,
consisting of terms involving noise eigenvalues and vectors), the
second, $\xi^{(s)}$, with the signal. Because the signal eigenvalues are
much larger than those of the noise, the sum in Eq. (33) is essentially
dominated in the beginning (for small $k$) by $\xi^{(s)}$. These two
errors decrease during the locking phase until reaching a steady state
value. The locking time (i.e., the time at which the steady state is
reached) is defined to be that at which $\xi^{(s)}$ is of the order of
the total EMSE expected in the steady state.

From Eqs. (19), (30) and (33) we obtain,

$$ \xi_k^{(s)} = \frac{N}{4}\Big[\big(\tilde{v}_k^{(0)}\big)^2 + \big(\tilde{v}_k^{(1)}\big)^2\Big] \tag{37} $$

$$ \phantom{\xi_k^{(s)}} = \frac{N}{4}\Big(1 - \frac{\mu N}{2}\Big)^{2k}\Big[\big(\tilde{v}_0^{(0)}\big)^2 + \big(\tilde{v}_0^{(1)}\big)^2\Big], \tag{38} $$

where we have assumed $N \gg \beta$.

We set the starting point $\mathbf{w}_0$ in $\mathcal{W}$ to be 0. It
corresponds to the initial value $\tilde{\mathbf{v}}_0$ in the
eigenspace, which is given by
$\tilde{\mathbf{v}}_0 = -Q\mathbf{w}^*$. The first two coordinates of
$\tilde{\mathbf{v}}_0$ can be directly obtained since the first two row
vectors of $Q$ are just the normalized signal eigenvectors of the $R$
matrix, leading to, for half-integral wavelength filters and
$N \gg \sigma^2$,

$$ \tilde{v}_0^{(0)} = \sqrt{2/N}\,\cos(d\,\delta\phi + \gamma/2), \tag{39} $$

$$ \tilde{v}_0^{(1)} = \sqrt{2/N}\,\sin(d\,\delta\phi + \gamma/2). \tag{40} $$

These considerations yield

$$ \xi_k^{(s)} = \frac{1}{2}\Big(1 - \frac{\mu N}{2}\Big)^{2k} . \tag{41} $$

This error must now be compared with the averaged EMSE in Eq. (35) in
order to find the time $k_{\mathrm{lock}}$ at which $\xi^{(s)}$ and
$\xi^{(st)}$ are equal:

$$ k_{\mathrm{lock}} \approx \frac{\ln(\mu N \sigma^2)}{2\ln(1 - \mu N/2)} . \tag{42} $$

It is important to mention that, when the product $\mu N/2$ tends to 1,
the convergence time diverges to infinity, meaning that the weights do
not converge toward $\mathbf{w}^*$ anymore. In order to ensure the
stability of the algorithm, the parameters will have to satisfy the
stability condition $0 < \mu N/2 < 1$. However, we have observed in our
simulations that when $1/2 < \mu N/2 < 1$, the convergence is slowed
down because of the presence of oscillatory terms in the gradient which
do not average to zero anymore. In practice, it is advisable to choose
the parameters so that $\mu N/2 < 1/2$.

For a sinusoid of amplitude $A$ instead of unity as we have considered
before, the condition for stability can be simply obtained by replacing
the parameter $\mu N/2$ by $\rho := \mu N A^2/2$, leading to
$0 < \rho < 1$.

We illustrate in Figs. 2 and 3 with an example the results of this
Section pertaining to the approach to locking and steady state
analysis.
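To give a feeling for the orders of magnitude involved, the locking time $k_{\mathrm{lock}} \approx \ln(\mu N\sigma^2) / (2\ln(1 - \mu N/2))$ can be evaluated directly. In the Python sketch below, the function name `locking_time` and the values $\mu = 0.002$, $N = 40$, $\sigma^2 = 0.09$ are assumptions chosen for illustration; the function simply rejects parameters outside the stability condition $0 < \mu N/2 < 1$ stated in the text.

```python
import math


def locking_time(mu, n_taps, sigma2):
    """Locking time in samples for a unit-amplitude sinusoid in white noise."""
    rho = mu * n_taps / 2
    if not 0.0 < rho < 1.0:
        raise ValueError("stability condition 0 < mu*N/2 < 1 is violated")
    return math.log(mu * n_taps * sigma2) / (2 * math.log(1 - rho))


k_lock = locking_time(mu=0.002, n_taps=40, sigma2=0.09)
print("locking time (samples):", k_lock)
```

With these assumed values $\mu N/2 = 0.04$, well inside the recommended range $\mu N/2 < 1/2$, and the filter locks after roughly 60 samples.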
IV. THE ALE IN PRACTICE

In the previous Sections we have characterized the be- 
haviour of the ALE in cases of interest. We will now 
elaborate on how this algorithm can be adapted to the 
interferometric data. 

In the scheme we present here, we first decompose the 
signal in p frequency subbands to which we apply the 
ALE twice with different sets of parameters. In the first 
stage, the parameters are tuned to best remove long-term 
sinusoidal components of the noise; whereas in the second 
stage, the target consists of shorter oscillatory transients. 



A. Subband decomposition 

Interferences such as mains power and violin mode har- 
monics are distributed over a large dynamic scale (the 
first harmonics are of much larger amplitude than those 
of high order). But, since the interferometer noise curve 
also decreases at low frequencies, their relative amplitude 
as compared to the background noise power spectrum at 
the same frequency remains large. Therefore, the model 
introduced previously, namely that of large amplitude 
sinusoidal signals embedded in broadband noise, is a rea- 
sonable approximation within the relevant small band- 
width of frequencies. 

For this reason, we divide the frequency axis into $p$ disjoint
frequency subbands of the same size. The $p$ signals lying in each of
the subbands are heterodyned and decimated to the sampling frequency
$f_s^{\mathrm{band}} := f_s/p$.

The tiling has the advantage that, if p is sufficiently 
large, we can consider the interferometer background 
noise almost white within a subband, which implies that 
the noise has vanishing correlation time. The prediction 
depth d which has to be larger than the correlation time, 
can be then simply fixed to any value greater than 1 sam- 
ple period in each of the subbands. 



B. Long-term sinusoid removal 

Certain parts of the spectrum may not contain any long-term periodic
interferences. We apply a preliminary test to exclude subbands which may
not require the first denoising step. The test is crudely done by
estimating the amplitude $A$ of the sinusoid from the largest peak of
the power spectrum (Welch estimate) and comparing it to the variance
$\sigma^2$ of the broadband noise (also estimated from the power
spectrum). If it is found that $A > \sigma$, we decide that there exists
a long-term sinusoidal signal of sufficient amplitude in the band which
needs to be removed; otherwise we proceed directly to the second step.

We apply the ALE in each of the selected subbands 
choosing parameters as follows: 



• Number of tap-weight coefficients $N$

The number of tap-weight coefficients is fixed by prescribing an upper
bound $0 < \eta_{\mathrm{noise}} < 1$ on the ratio between the noise
power corrupting the filtered output $y_k$ of the optimal filter
$\mathbf{w}^*$ and the input noise power. Let
$\mathbf{n}_k := (n_{k-d-m},\ m = 0, 1, \ldots, N-1)^t$ be a collection
of noise samples; then the above condition reads

$$ \frac{E[(\mathbf{w}^{*t}\mathbf{n}_k)^2]}{E[n_k^2]} < \eta_{\mathrm{noise}}, \tag{43} $$

which, with the stationarity and whiteness of the background noise
$n_k$, results in a bound on the optimal filter gain:

$$ \mathbf{w}^{*t}\mathbf{w}^* < \eta_{\mathrm{noise}} . \tag{44} $$

The $L^2$ norm
$\|\mathbf{w}^*\|_2^2 = (2/N^2)\big(N + \beta\cos(\gamma + 2d\,\delta\phi)\big)$
is obtained by squaring and summing Eq. (14) for the optimal filter.
Since in typical cases $\beta \ll N$, this leads to the simpler
expression $\|\mathbf{w}^*\|_2^2 \approx 2/N$.

Consequently, the number of tap-weight coefficients $N$ has to be chosen
so that

$$ N > 2/\eta_{\mathrm{noise}} . \tag{45} $$

• Step gain parameter $\mu$

We fix the step gain parameter by requiring the distance of the
ALE filter from optimality in the steady state to be smaller, on
average, than a given threshold. As we have seen in Sect. IIIC,
this can be done naturally by imposing an upper bound
$0 < \eta_{\rm sig} < 1$ on the excess mean square error as
compared to the signal power $E_s = A^2/2$:

$$\xi^{(st)}/E_s \le \eta_{\rm sig}. \tag{46}$$

Using the expression obtained in the steady state analysis in
Eq. (35) for the EMSE, this condition reduces to

$$\mu \le \eta_{\rm sig}/(N\sigma^2). \tag{47}$$

Generally, this equation leads to small values of $\mu$ which
prevent the ALE filter from converging from its initial state
(i.e., all tap-weight coefficients set to 0) in a reasonable time
(i.e., faster than a tenth of a second, which is the duration of
the chunk of data). We solve this problem by first applying the
ALE to a sequence of training data, the step gain parameter being
set at the beginning to a large value (for fast convergence) and
decreased gradually to the value given in Eq. (47). The filter
obtained after the completion of this training is close to the
objective (i.e., the Wiener filter). We then start the long-term
sinusoid removal using this prepared filter.

We remark here that although $\mu$ is small, it is non-zero,
thus giving the ALE filter some flexibility to adapt to changes
(non-stationarities) in the signal such as slow drifts in
frequency and amplitude modulation. This property, however, needs
to be investigated in more detail.
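
The parameter choices of this first step, together with the training phase, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the LMS update is written in the Widrow-Hoff form $w \leftarrow w + 2\mu e_k \mathbf{x}_k$ (the update rule is defined earlier in the paper, outside this section), the step gain is decreased geometrically during training (the text only says "gradually"), and all function names are hypothetical.

```python
import numpy as np

def choose_parameters(eta_noise, eta_sig, sigma2):
    """N from Eq. (45) and the largest step gain allowed by Eq. (47)."""
    n_taps = int(np.ceil(2.0 / eta_noise - 1e-9))   # small guard against round-off
    mu = eta_sig / (n_taps * sigma2)
    return n_taps, mu

def ale_lms(x, n_taps, d, mu, w0=None):
    """Adaptive line enhancer: predict x[k] from x[k-d-m], m = 0..N-1.

    mu may be a scalar or an array holding one step gain per sample.
    Assumed update: w <- w + 2 * mu * error * reference.
    """
    x = np.asarray(x, dtype=float)
    mu = np.broadcast_to(mu, x.shape)
    w = np.zeros(n_taps) if w0 is None else np.array(w0, dtype=float)
    y = np.zeros_like(x)
    for k in range(d + n_taps - 1, len(x)):
        ref = x[k - d - n_taps + 1:k - d + 1][::-1]   # x[k-d], ..., x[k-d-N+1]
        y[k] = w @ ref                                # filtered output
        w = w + 2.0 * mu[k] * (x[k] - y[k]) * ref     # LMS correction
    return y, w

def train_then_run(train, data, n_taps, d, mu_final, mu_start):
    """Train with a decreasing step gain, then filter with the final value."""
    schedule = np.geomspace(mu_start, mu_final, len(train))  # assumed schedule
    _, w = ale_lms(train, n_taps, d, schedule)
    return ale_lms(data, n_taps, d, mu_final, w0=w)
```

With $\eta_{\rm noise} = \eta_{\rm sig} = 0.01$ and $\sigma^2 = 0.1$, `choose_parameters` gives N = 200 and $\mu = 5 \times 10^{-4}$; the long-term sinusoid estimate `y` returned by `train_then_run` is what would be subtracted from the subband signal.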



C. Ringdown removal 

The aim of the second step of the algorithm is to remove
oscillatory transients (ringdowns) of large amplitude. These
transients are either frequency bands excited from time to time
(caused by malfunctions in the interferometer) or relics from the
previous step (when the envelope of a long-term sinusoid
possesses fast variations which the algorithm cannot adapt to or
converge onto during the first removal step).

The cleaning procedure consists in applying the ALE a second
time to each of the subbands, but now with the parameters
adjusted so that the filter (i) selects features with a larger
bandwidth than in the previous step, and (ii) converges rapidly
onto any noisy oscillatory signal that may appear.

• Number of tap-weight coefficients N

The impulse response duration and the frequency selectivity
(i.e., the filter bandwidth $\Delta f$) of the transfer function
are dual in character. This follows from the uncertainty
relation. The rough approximate relation between these
quantities is given by

$$N = f_s/(p\,\Delta f), \tag{48}$$

where $f_s$ is the sampling frequency. We choose the number of
tap-weight coefficients N by imposing a minimum bandwidth
$\Delta f_{\rm min}$ on the filter and using the above equation.

• Step gain parameter $\mu$

Assuming that the ringdown can be locally approximated by a
sinusoid, we choose the step gain parameter by imposing a
convergence time of the order of a typical transient duration
(i.e., $t_{\rm lock} \approx N$). More concretely, setting
$\bar\mu := \mu N A^2/2$ in the unnormalized form of
Eq. ( p2| ) (i.e., for arbitrary ringdown amplitude A), we solve
for $\bar\mu$ the equation

$$\frac{\cdots}{2\ln(1-\bar\mu)} = N. \tag{49}$$

Using the crude estimate $A^2/2 \approx \|\mathbf{x}_k\|_2^2/N$
for the ringdown amplitude, the step gain parameter is finally
obtained as $\mu = \bar\mu/\|\mathbf{x}_k\|_2^2$.

Since the ringdown signals are of short duration and can occur
with large time gaps, the ALE does not need to operate on each
data segment. Accordingly, we have added a supervision test which
decides whether or not the denoising algorithm should be applied
to a given data segment. The test consists of observing the
Gaussianity of the filtered output $y_k = w_k^t \mathbf{x}_k$. If
the input signal $x_k$ is a zero-mean white Gaussian process of
variance $\sigma^2$, then the output of the filter $y_k$ shares
the same characteristics, except that the variance gets
multiplied by the filter gain:
${\rm var}\, y_k = \|w_k\|_2^2\,\sigma^2$. Furthermore, under
this hypothesis, the envelope $z_k = |\mathcal{H}(y)_k|^2$
($\mathcal{H}$ denotes the discrete Hilbert transform of $y_k$,
see footnote 3) follows by definition a chi-square distribution
with 1 degree of freedom.

This implies that, up to an arbitrary probability $P_0$, the
envelope $z_k$ does not exceed the threshold given below:

$$z_k \le \kappa(P_0)\, 2 f_s^{\rm band}\, \|w_k\|_2^2\, \sigma^2, \tag{50}$$

where $\kappa(\cdot)$ is the inverse function of the (unit
variance) $\chi^2$ cumulative distribution function (cdf).

If Eq. (50) is satisfied, we conclude that the filtered output
is essentially due to Gaussian background noise and we leave the
input signal as it is. Otherwise, we conclude that the filtered
output carries a ringdown signal and decide to remove it from the
input data.

The functioning of the second step of the denoising algorithm
can be interpreted as follows: it removes from the input data
regions in the time-frequency plane presumably associated with
transients, whose support is defined along the frequency axis by
the ALE filter, and along the time axis by the supervision
criterion (50).

After completing these two steps, we recombine the signals of
all the subbands to retrieve a single strain signal.

3 The discrete Hilbert transform $y_n = \mathcal{H}(x)_n$ of a
signal $x_n$ is essentially obtained by cancelling its negative
frequencies; more precisely, $Y(f) := 2U(f)X(f)$, with
$U(f) = 1$ when $f \in [0, 1/2]$ and $U(f) = 0$ when
$f \in\, ]-1/2, 0[$, and where $X(f)$ (and $Y(f)$) denotes the
Fourier transform of the corresponding signal,
$X(f) = \sum_n x_n e^{-2\pi i n f}$.



V. NUMERICAL RESULTS

A. Simulated data: test of the ringdown removal

In this section, the goal is to test how effectively the second
stage of the denoising algorithm (i.e., the ringdown removal)
described in Sect. IV operates on a simple signal. The test
signal is composed of three ringdown signals (of fixed amplitude
and frequency) occurring successively in the data stream and
embedded in additive Gaussian white noise. This model may be used
to represent ringdown disturbances originating from the same
underlying physical mechanism.

Each of these ringdown signals is a sinusoidal waveform, similar
to Eq. ( |lfj| ) (with $A = 1$, $f_0 = 50$ Hz and sampling
frequency $f_s = 200$ Hz), whose support is limited in time by a
Gaussian envelope:

$$r_k := A\exp\left(-\pi (t_k - t_c)^2/T^2\right)\cos(2\pi f_0 t_k + \phi), \tag{51}$$

where three different reference times $t_c$ are given and the
equivalent time duration is $T = 200$ ms (giving a frequency
bandwidth of $\Delta f \approx 1/T = 5$ Hz and
$Q := f_0 T \approx 10$ cycles).
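
A test signal of this kind can be generated with the sketch below, which follows Eq. (51). The three reference times, the record length and the random seed are not given in the text and are chosen here for illustration only; the noise level $\sigma = 0.25$ corresponds to the ratio $A^2/(2\sigma^2) = 8$ quoted in the caption of Fig. 4. Function names are hypothetical.

```python
import numpy as np

def ringdown(t, t_c, amp=1.0, f0=50.0, T=0.2, phase=0.0):
    """Eq. (51): sinusoid at f0 under a Gaussian envelope of equivalent duration T."""
    envelope = np.exp(-np.pi * (t - t_c) ** 2 / T ** 2)
    return amp * envelope * np.cos(2 * np.pi * f0 * t + phase)

def make_ringdown_data(duration=10.0, fs=200.0, centers=(2.0, 5.0, 8.0),
                       sigma=0.25, seed=0):
    """Three successive ringdowns in white Gaussian noise.

    The values of centers, duration and seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(int(duration * fs)) / fs
    clean = sum(ringdown(t, t_c) for t_c in centers)
    noisy = clean + sigma * rng.standard_normal(t.size)
    return t, clean, noisy
```

The name "equivalent duration" reflects the fact that the Gaussian envelope $\exp(-\pi t^2/T^2)$ integrates to T.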

Figure 4 describes the application of the denoising algorithm
configured with $d = 5$ sampling periods (equal to 25 ms),
$\Delta f_{\rm min} = 3$ Hz, and $P_0 = 0.01$. It can be seen
that the algorithm operates better on a transient encountered
later in the data train than on its predecessors. The
explanation is that a transient duration is too short for the
filter to reach the steady state but, when it encounters the next
transient, the filter benefits from the distance to $w^*$
previously covered, thus improving the convergence towards
optimality.

This can be verified with a time-frequency representation [16]
of the output signal such as in Fig. 4, where we have chosen the
spectrogram $S_x^h[n,m] := |F_x^h[n,m]|^2$, defined as the
squared modulus of the short-time Fourier transform:

$$F_x^h[n,m] := \sum_k x_k\, h_{k-n}\, e^{-2\pi i k m}, \tag{52}$$

where $n \in [1, 2, \ldots, N]$, $m \in\, ]-1/2, 1/2]$ and $h_k$
is an arbitrary window (a Gaussian window here).

Notice that real time and frequency coordinates can be retrieved
through the relations $t = n/f_s$ and $f = m f_s$.
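
The spectrogram of Eq. (52) can be computed, for instance, as in the following sketch. The window length, the Gaussian width and the number of frequency bins are arbitrary choices made for the illustration, and the function name is hypothetical. Since the input is real, only the reduced frequencies $m \in [0, 1/2]$ are returned.

```python
import numpy as np

def spectrogram(x, n_freq=256, win_len=65, hop=1):
    """Squared modulus of a short-time Fourier transform with a Gaussian window.

    Row i corresponds to the time index n = i * hop and column j to the
    reduced frequency m = j / n_freq. Requires win_len odd and <= n_freq.
    """
    x = np.asarray(x, dtype=float)
    half = win_len // 2
    k = np.arange(-half, half + 1)
    h = np.exp(-np.pi * (k / (half / 2.0)) ** 2)   # Gaussian window h_k
    h /= np.sqrt(np.sum(h ** 2))                   # unit-energy normalisation
    padded = np.pad(x, half)                       # zero-pad both ends
    frames = np.lib.stride_tricks.sliding_window_view(padded, win_len)[::hop]
    # The FFT of each windowed frame equals F[n, m] up to a phase factor,
    # which disappears when the squared modulus is taken.
    return np.abs(np.fft.rfft(frames * h, n=n_freq, axis=1)) ** 2
```

Real coordinates follow from the relations given above: row i lies at $t = i \cdot {\rm hop}/f_s$ and column j at $f = j f_s/n_{\rm freq}$.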

B. Results on Caltech 40m proto-type data 

Here we have applied the algorithm to the Caltech 40 meter
proto-type data taken in October 1994 [17]. These data were
recorded with a sampling frequency of $f_s = 9.86$ kHz. We have
used the calibrated strain signal $\delta l/l$ (relative arm
length measurement) for applying our algorithm.

We tile the complete spectrum into p = 32 frequency subbands of
approximately 154 Hz each. Each subband typically contains one
or two long-term sinusoidal interferences.

We have chosen the prediction depth to be $d = 5$ sampling
periods, which corresponds to a delay of $pd/f_s \approx 16$ ms
in real time. The correlation time of the broadband noise is
effectively smaller than this in each subband, except at the
extremities of the spectrum where the steep slope of the spectrum
does not allow us to assume the background noise to be locally
white. This only affects the first and last subbands, which are
not too important for detection purposes.

In the first stage, we have chosen $\eta_{\rm noise} = 0.01$
(giving $N = 200$ according to Eq. (45)) and
$\eta_{\rm sig} = 0.01$. In the second stage of ringdown removal,
the minimum filter bandwidth has been fixed to
$\Delta f_{\rm min} = 3$ Hz, which gives a filter with $N = 100$
tap-weight coefficients (see Eq. (48)), and we have set
$P_0 = 0.01$ for the Gaussianity test. We have performed two
types of simulations:

• a "Caltech signal only" simulation to measure improvements
after denoising: we check firstly whether the frequency peaks are
removed from the noise power spectrum and secondly whether the
noise statistics are closer to Gaussian than before denoising,

• a "Caltech+inspiral" simulation to evaluate the consequences
of the denoising algorithm on gravitational wave detection,
specifically for the case of the inspiralling compact binary
signal. The question here is whether the denoising operation has
removed a significant part, or even the whole, of the inspiral
signal.
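
As a cross-check, the parameter values quoted above for the 40 meter data can be recomputed from Eqs. (45) and (48) and from the tiling. The short script below only reproduces this arithmetic; the variable names are ours.

```python
f_s = 9.86e3        # sampling frequency quoted in the text (Hz)
p = 32              # number of subbands
d = 5               # prediction depth (samples at the subband rate)
eta_noise = 0.01    # first-stage noise bound
df_min = 3.0        # second-stage minimum filter bandwidth (Hz)

subband_width = f_s / (2 * p)          # about 154 Hz
f_band = f_s / p                       # subband sampling frequency, about 308 Hz
delay_ms = 1e3 * p * d / f_s           # about 16 ms
n_first_stage = 2 / eta_noise          # Eq. (45): 200
n_second_stage = f_s / (p * df_min)    # Eq. (48): about 103

print(subband_width, f_band, delay_ms, n_first_stage, n_second_stage)
```

Equation (48) gives about 103 coefficients for the second stage, which agrees with the quoted N = 100 if the latter is understood as a rounded value of this rough approximation.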

Caltech signal only. Eleven of the thirty-two frequency subbands
(# 1-9, 11 and 17) are selected and sent to the first cleaning
step of the algorithm. In these subbands, we obtained the
following mean values: $A \approx 1.5 \times 10^{-16}$ and
$\sigma \approx 3.6 \times 10^{-17}$ (the sinusoid amplitude A
equals approximately 1 to at most 5 times the noise standard
deviation $\sigma$), leading to typical values for the
signal-to-noise ratio of about
${\rm SNR} = A^2/(2\sigma^2) \approx 8.7$ (9.4 dB) and for the
step gain parameter (see Eq. (47)) of
$\mu N A^2/2 \approx 0.04$ (spanning from 0.01 to 0.14).

The complete set of subband signals is processed in the second
step. The typical noise variance estimate is
$\sigma = 1.35 \times 10^{-17}$ (from $4.8 \times 10^{-18}$ to
$10^{-16}$), leading according to Eq. (49) to values of
$\mu N \sigma^2$ which span the range from 0.07 to $10^{-4}$.

Figure 5 illustrates how the algorithm operates in the fifth
frequency subband (from 617 Hz to 771 Hz) among the p = 32 being
processed. This frequency band contains two power line harmonics
(the 11th at 660 Hz and the 12th at 720 Hz).

Figures 6 and 7 show, respectively, comparisons between the
power spectra and the histograms of the signal before and after
denoising. We observe that after denoising, the frequency peaks
have been removed from the input signal and the histogram appears
much closer to the Gaussian bell curve.

Caltech signal + inspiral waveform. The purpose of this test is
to evaluate how the cleaning operation affects gravitational wave
detection and, in particular, to check whether a significant part
of the gravitational signature could be removed from the data.
Answering this question by analytical means is difficult;
however, a qualitative rationale can be given in the case of
inspiralling binaries and verified with simulations.

The theory predicts [18] that the gravitational waves emitted
from inspiralling binaries of neutron stars are oscillating
waveforms whose frequency evolves in time in a prescribed manner
and scans the interferometer bandwidth from the lower end to the
higher.

Their weak amplitude and short time duration within a 
single subband (in the case we have considered, less than 
a second) make them "invisible" to the ALE filter. The 
amplitude and the duration of the gravitational wave sig- 
nal are simply not large enough for the ALE coefficients 
to converge onto the gravitational wave instantaneous 
frequency. 
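
The order of magnitude of the time spent in one subband can be checked with the standard Newtonian (quadrupole) expression for the time left before coalescence, $\tau(f) = \frac{5}{256}(\pi f)^{-8/3}(G\mathcal{M}/c^3)^{-5/3}$, where $\mathcal{M}$ is the chirp mass. The sketch below is an independent illustration, not taken from the paper; the rounded physical constants and the function names are ours, and the subband width of about 154 Hz is the one used for the 40 meter data.

```python
import numpy as np

G = 6.674e-11       # gravitational constant (m^3 kg^-1 s^-2)
C = 2.998e8         # speed of light (m/s)
M_SUN = 1.989e30    # solar mass (kg)

def chirp_mass(m1, m2):
    """Chirp mass, in the same units as m1 and m2."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2

def time_to_coalescence(f, m1=1.4, m2=1.4):
    """Newtonian time (s) left before coalescence at wave frequency f (Hz)."""
    t_m = G * chirp_mass(m1, m2) * M_SUN / C ** 3
    return 5.0 / 256.0 * (np.pi * f) ** (-8.0 / 3.0) * t_m ** (-5.0 / 3.0)

def dwell_time(f_lo, f_hi, **masses):
    """Time (s) the chirp spends between frequencies f_lo and f_hi."""
    return time_to_coalescence(f_lo, **masses) - time_to_coalescence(f_hi, **masses)
```

For two 1.4 solar mass stars, `time_to_coalescence(200.0)` is about 0.34 s, and the dwell time in any 154 Hz subband above the first one stays below one second, in line with the statement above.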

We have checked the validity of this argument by adding to the
Caltech signal the inspiralling 'chirp' waveform in the Newtonian
approximation [18] of a binary of two neutron stars, each having
a mass of $M = 1.4$ solar masses, located at a distance of
$r = 7$ kpc from the Earth.

Figure 8 depicts a comparison of the matched filter detector
response on the same signal with and without denoising. The
detector output displays a peak of the same height and at the
correct instant, showing that the cleaning algorithm has not
removed the inspiral signal from the data. This can be
crosschecked in Fig. 9, showing a zoomed view of the same signal
after denoising.



VI. CONCLUDING REMARKS 

The originality of the proposed denoising algorithm lies in its
wide applicability: both types of disturbances, long-term
sinusoids and oscillatory transients (a type of noise which has
been ignored until now), can be treated. Although the question
of the computational burden of this algorithm has not been fully
addressed here, it appears from the simplicity of the operations
involved (e.g., no requirement for long-term FFTs) that the total
computational cost should be within acceptable limits, so that
the algorithm can be operated in real time. Furthermore, the
structure of the algorithm, already implemented in Matlab [19],
can easily be translated into a parallel code (each processing
node can be associated with one frequency subband and the
processing can be done independently).

As part of future extensions to the present work, some
improvements to the current code might be needed: in order to
limit the finite size effects in the subband decomposition and
reconstruction, a reversible filter bank (e.g., a Gabor
transform) would be preferable to the crude method used here.

The key idea (i.e., looking for correlation between the current
sample of the strain signal and a reference signal, namely a set
of past samples) can also be extended to investigate correlations
of the detector output with other environmental channels, by
simply using them as a reference rather than the strain signal
itself. Similarly to the cross-talk removal in [20] but with
adaptive methods, such an algorithm would provide an estimate of
any poorly known (linear) transfer functions relating noise
sources to their final leakage into the detector output, and of
the environmental contamination that must be subtracted from the
data, if so desired.

FIG. 1. The figure illustrates the principle of the underlying method on which the algorithm we propose is based. The
algorithm is designed to discriminate the nonstationary and non-Gaussian noise features from the broadband background noise 
in interferometric gravitational wave data. This method is referred to as LMS adaptive line enhancement and its objective is 
to compare the signal and its linear prediction, the predictor coefficients being adjusted by a feedback loop controlled by the 
prediction error. 

FIG. 2. Applying the ALE to a sinusoidal signal: approach to locking and steady state. The figure illustrates
how the ALE performs on a test signal composed of a sinusoid ($f_0 = 50$ Hz sampled at $f_s = 1$ kHz) corrupted by white noise
(${\rm SNR} = A^2/(2\sigma^2) = 50$ (17 dB)). We initialize the $N = 40$ tap-weight coefficients to 0, and set the prediction depth to $d = 5$ (ms)
and the step gain parameter to $\mu = 0.003$. (a): the filter output signal $y_k$ (solid line) converges rapidly towards the actual noise
free sinusoidal signal (dashed line). (b): this is confirmed by observing that the "signal" EMSE defined in Eq. (^)
decreases with time until it reaches its steady state value. For comparison, the horizontal dashed line indicates the theoretical
mean value of the total EMSE $\xi^{(st)}$ in the steady state (see Eq. (35)). We can verify that the theoretical value obtained in
Eq. ( fl2] ) for the convergence time $t_{\rm lock}$ corresponds effectively to the time instant at which the "signal" EMSE and $\xi^{(st)}$ are of the same order.
Finally, the two contour plots of the bottom line display the trajectory followed by the adaptive filter coefficients in the eigen
weight space: in (c), the axes are the first two eigenvectors of R (i.e., the "signal" eigenvectors), whereas
in (d) the diagram plane is given by one "signal" eigenvector and one "noise" eigenvector (i.e., a "signal" direction vs. a "noise" direction). As proved in Sect. IIIB,
the weight coefficients converge more rapidly along the two directions given by the eigenvectors associated with the
largest eigenvalues.

FIG. 3. Applying the ALE to a sinusoidal signal: convergence time. The figure shows the comparison between
the convergence time $t_{\rm lock}$ obtained by simulations (solid line with '+', associated with one noise realization) and its theoretical
value (solid line) given in Eq. (fi^). The test signal is a sinusoid in white Gaussian noise (see Eq. © with $A = 1$, $f_0 = 50$
Hz, sampled at $f_s = 200$ Hz). The convergence time is shown as a function of $\sigma^2$ in (a) (we have fixed the remaining ALE
parameters to $N = 200$ and $\mu N A^2/2 = 0.01$) and in (b) as a function of $\mu N A^2/2$ ($N = 200$ and $\sigma^2 = 0.1$). Simulations globally
confirm the results obtained in Eq. (Q except when $1/2 < \mu N A^2/2 < 1$. The reason for this discrepancy is that the difference
between the actual gradient (Q) and its estimates (^) can be shown to be an oscillating term which no longer averages to zero
when the step gain parameter approaches the critical value for stability. Therefore, in practice, we choose parameters
so that $\mu N A^2/2 < 1/2$.



[Figure 4: panels (a) input signal, (d) adaptive filter; horizontal axes: time (s)]

FIG. 4. Applying the ALE to oscillating transients: testing the ringdown removal algorithm. The figure
depicts the results of the transient removal algorithm presented in Sect. IV C to a signal in (a) composed of three successive
oscillating bursts (see text for details) embedded in Gaussian white noise (${\rm SNR} = A^2/(2\sigma^2) = 8$ (9 dB)). In addition to the ALE,
we measure the deviations of the filtered output from Gaussianity: when its envelope (c) exceeds some threshold (dashed
line, see Eq. (50)), we decide that the filtered output is not normally distributed and, therefore, contains a transient which
has to be removed (this is indicated by dots at the top of the graph). The final net effect of the operation is that of time
variant filtering of the input data. The corresponding transfer function is represented in (d): the regions indicated with dark
colours are parts of the time-frequency plane where the data are selected and removed from the input (i.e., it corresponds to
the time-frequency "band" pass of the filter). Comparing the spectrograms [16] (see Eq. (52) for a definition) in (b) and (f)
respectively of the input and output (e) signals, we observe that the three transients are progressively removed from the input
(dark regions represent large values of the time-frequency energy density).

[Figure 5: panel (a) input signal, band #5]

FIG. 5. Illustration of the denoising procedure on Caltech proto-type data. In the subband #5 (between 617
Hz and 771 Hz), the signal [17] (the data were taken on October 14th, 1994, frame #2) in (a) contains two power line
harmonics (at 660 Hz and 720 Hz), which are seen as darkened horizontal lines in the spectrogram (see Eq. (52)) in (b).
We apply the ALE a first time to suppress long-term components (c), with corresponding spectrogram (d). A second run (g),
supervised by the criterion (50) detailed in (e) and (f) (see the similar plots in Fig. 4, (c) and (d), for explanation), eliminates
artefacts of shorter duration (such as fast fluctuations in the harmonic envelope). Note that the spectrogram (h) of the final
signal presents a homogeneous energy density both in time and frequency directions.

FIG. 6. "Caltech signal only": comparison between power spectra of ALE input/output signals. The figure
depicts power spectra of the Caltech 40 meter signal (top) in the operating frequency band, between 300 Hz and 1 kHz, and the
same signal after denoising (bottom).

FIG. 7. "Caltech signal only": comparison between histograms of ALE input/output signals. The probability
density functions of the Caltech 40 meter signal selected between 300 Hz and 1 kHz (left column) and the same signal after
denoising (right column) have been estimated with histograms (top row). The bottom row shows the same histograms in special
axes where a Gaussian bell curve appears as a straight line.

FIG. 8. "Caltech+inspiral" signal: matched filter response before and after denoising. The Newtonian
approximation of a gravitational wave emitted from an inspiralling binary (each star with mass 1.4 solar masses, at a distance of 7
kpc and coalescence time fixed to $t = 0$) has been added to the Caltech interferometric proto-type data. Top row plots show
this signal (a) and its corresponding version after denoising (b), which have been selected and whitened within the frequency
band 200 Hz (i.e., the lower frequency bound of the observation window) to 1.3 kHz (i.e., the predicted frequency for the last
stable circular orbit of the binary). The matched filter technique applied to detect the inspiral waveform shows in both cases
(i.e., without (c) and with (d) denoising) a peak at time $t = 0$ in their detector responses (the normalization is chosen so that
"noise only" detector fluctuations are of unit variance).

FIG. 9. "Caltech+inspiral" signal: zoomed view after denoising. As an additional check, this diagram presents a
zoomed view at the coalescence time $t = 0$ of the signal in Fig. 8-b (whitened signal after denoising). We have superimposed
on it (in bold) the signal as it would appear in the noise free case.